Skip to content

feat: xarm manipulator adapter improvement#2425

Open
TomCC7 wants to merge 7 commits into
mainfrom
cc/feat/xarm-manipulator-updates
Open

feat: xarm manipulator adapter improvement#2425
TomCC7 wants to merge 7 commits into
mainfrom
cc/feat/xarm-manipulator-updates

Conversation

@TomCC7

@TomCC7 TomCC7 commented Jun 8, 2026

Copy link
Copy Markdown
Member

Internal version of #2353 , closing the original pr.

Closes #1183

Solution

This PR resolves the following issues observed on teleoperating xarm:

KeyboardTeleopModule initial position sync

Wire robot joint states to the keyboard teleop module so that it initialize target to the robot startup position. A more proper fix in my mind should be to refactor keyboard teleop module to only output relative cartesian motion and wire that to some relative cartesian task so that the teleop module is decoupled from the actual robot state.

XArm graceful start/stop

Added two lifecycle methods in manipulator adapter activate/deactivate to execute functions required before robot starts/stop movement. The semantic is different from connect/disconnect in that sometime you might want to only pause robot commanding while still keeping the connection. Now the xarm will move to the default position on start/stop.

How to Test

# connects to a real robot for testing
dimos --xarm6-ip=192.168.1.210 run keyboard-teleop-xarm6

Contributor License Agreement

I have read and approved the CLA.

TomCC7 and others added 5 commits June 3, 2026 14:35
- Fix _XARM6/7_INITIAL_JOINTS to use degrees instead of radians
- Add motion_enable(False) before set_state(4) in stop()
- Update custom arm docs with activate/deactivate lifecycle methods
- Ignore .omo/ directory
Remove unnecessary getattr/callable/hasattr guards since
ManipulatorAdapter Protocol guarantees these methods exist.
- Guard activate()/deactivate() calls in the coordinator so adapters
  without lifecycle methods (twist bases, whole-body) no longer raise
  AttributeError; restore the write_enable(True) fallback on setup
- Implement activate()/deactivate() in MockAdapter and ShmMujocoAdapter
  to satisfy the extended ManipulatorAdapter protocol
- Log and set the stop event when keyboard teleop fails to read the
  initial joint state instead of exiting the thread silently
- Remove unused home_pose computation in keyboard teleop
- Add coordinator test covering adapters without lifecycle methods

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@greptile-apps

greptile-apps Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR improves the xArm manipulator integration by adding activate()/deactivate() lifecycle methods to all adapters (xArm, A750, OpenArm, Piper, Sim, Mock) and wiring them into the ControlCoordinator start/stop flow. It also fixes keyboard teleop initial pose sync by reading live joint state from the stream instead of a hardcoded config value, and corrects the A/D key direction mapping.

  • Adapter lifecycle: new activate()/deactivate() hooks are called during coordinator startup and shutdown; xArm-specific implementations move the arm to a safe home pose on both enter and exit, using position mode (mode 0) with a blocking set_servo_angle call before switching to servo mode (mode 1) for teleop.
  • Teleop initial sync: _pygame_loop now blocks on joint_state.get_next(initial_state_timeout) before entering the control loop, seeding current_pose from the actual robot state; the SPACE key similarly re-syncs to the live pose.
  • Key direction fix: A/D mapping was swapped to correctly assign +Y to A and \u2212Y to D, with both the key-handler and the HUD updated consistently.

Confidence Score: 4/5

Safe to merge with one thread-lifecycle fix in the keyboard teleop module.

The coordinator lifecycle wiring and all adapter implementations are correct. The one concrete gap is in KeyboardTeleopModule: DEFAULT_THREAD_JOIN_TIMEOUT (2 s) is shorter than the newly introduced initial_state_timeout (5 s), and get_next blocks in an uninterruptible RxPY .run() call. If stop() is called before a joint state arrives, the join times out and super().stop() cleans up the module while the thread is still alive; a joint state arriving in the remaining window would cause the thread to publish on an already-torn-down stream.

dimos/teleop/keyboard/keyboard_teleop_module.py - thread join / initial-state-timeout mismatch

Important Files Changed

Filename Overview
dimos/teleop/keyboard/keyboard_teleop_module.py Switches initial joint-pose source from config to live joint_state stream (blocking get_next); adds SPACE-to-sync behavior; fixes A/D key direction. Thread join timeout (2 s) is shorter than initial_state_timeout (5 s), leaving a window where the thread publishes after module teardown.
dimos/hardware/manipulators/xarm/adapter.py Adds activate()/deactivate() with initial-pose homing (mode 0 to set_servo_angle wait=True to mode 1); _prepare_for_position_motion discards SDK return codes (previously flagged). set_state(4) correctly targets xArm stop state per SDK docs.
dimos/control/coordinator.py Wires activate()/deactivate() into startup/shutdown lifecycle; keeps write_enable fallback for adapters that pre-date this interface; deactivate loop correctly runs after tick loop is stopped.
dimos/hardware/manipulators/test_adapter_lifecycle.py New lifecycle test file covering Piper, OpenArm, and A750 adapters with mock SDKs; action sequence assertions look correct against their respective adapter implementations.
dimos/control/test_control.py Adds coordinator lifecycle integration tests; class-level mutable events list (previously flagged) is a latent issue if test class is reused.
dimos/hardware/manipulators/spec.py Adds activate()/deactivate() to the ManipulatorAdapter Protocol as optional methods; coordinator handles missing methods gracefully with getattr.
dimos/robot/manipulators/xarm/blueprints.py Passes joint_names to KeyboardTeleopModule so the name-based joint filtering path is used for xArm6 and xArm7 blueprints.
dimos/hardware/manipulators/mock/adapter.py Adds activate()/deactivate() stubs consistent with other adapters; deactivate correctly calls write_stop then write_enable(False).

Sequence Diagram

sequenceDiagram
    participant CC as ControlCoordinator
    participant XA as XArmAdapter
    participant KT as KeyboardTeleopModule

    Note over CC,XA: coordinator.start()
    CC->>XA: connect()
    XA-->>CC: True
    CC->>XA: activate()
    XA->>XA: _prepare_for_position_motion() mode 0
    XA->>XA: "_move_to_initial_pose() wait=True"
    XA->>XA: set_control_mode(SERVO_POSITION) mode 1
    XA-->>CC: True
    CC->>CC: publish joint_state 100Hz

    Note over KT: module.start()
    KT->>KT: "_read_joint_positions timeout=5s"
    CC-->>KT: JointState
    KT->>KT: JogState.from_fk(initial_joints)
    KT->>KT: cartesian_command.publish(initial_pose)
    loop teleop loop
        KT->>CC: cartesian_command PoseStamped
        CC->>XA: write_joint_positions
    end

    Note over CC,XA: coordinator.stop()
    CC->>XA: deactivate()
    XA->>XA: _prepare_for_position_motion() re-enable mode 0
    XA->>XA: "_move_to_initial_pose() wait=True"
    XA->>XA: motion_enable False + set_state 4
    XA-->>CC: True
    CC->>XA: disconnect()
Loading

Comments Outside Diff (1)

  1. dimos/teleop/keyboard/keyboard_teleop_module.py, line 107-115 (link)

    P1 Thread outlives stop() when initial joint state is slow

    DEFAULT_THREAD_JOIN_TIMEOUT is 2 s, but initial_state_timeout defaults to 5 s. _read_joint_positions blocks in observable().run() (an uninterruptible RxPY blocking call), so setting _stop_event does not interrupt it. If stop() is called before a joint state is received, join(2.0) times out and returns while the thread is still blocked in get_next(5.0). super().stop() then tears down the module. If a joint state subsequently arrives (between second 2 and second 5), the thread wakes up and calls self.cartesian_command.publish(...) on an already-stopped stream. Adding a _stop_event.is_set() guard immediately after _read_joint_positions returns, and capping initial_state_timeout to be ≤ DEFAULT_THREAD_JOIN_TIMEOUT, would close the window.

Reviews (2): Last reviewed commit: "fix: add manipulator adapter lifecycle m..." | Re-trigger Greptile

@@ -71,6 +75,8 @@ class KeyboardTeleopConfig(ModuleConfig):
ee_joint_id: int = 6
task_name: str = "cartesian_ik_arm"
home_joints: list[float] | None = None

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 home_joints config field is now dead code

The home_joints field in KeyboardTeleopConfig is no longer referenced anywhere in the updated code — the initial pose now always comes from the live joint_state stream. Having an ignored config option is confusing for downstream users: they may try to set it expecting it to have an effect. Consider removing the field or adding a deprecation notice until existing callers are updated.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

return None

if joint_names := self.config.joint_names:
name_to_position = dict(zip(msg.name, msg.position, strict=False))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 zip with strict=False can silently produce a wrong name→position mapping

If a JointState message arrives with name and position lists of different lengths (e.g., a partial update), zip with strict=False silently truncates to the shorter side. The downstream name_to_position dict would then miss some entries, causing the any(name not in ...) guard below to return None, which the caller will treat as "no data" rather than "corrupt data." Using strict=True would surface this as a distinct, diagnosable error.

Suggested change
name_to_position = dict(zip(msg.name, msg.position, strict=False))
name_to_position = dict(zip(msg.name, msg.position, strict=True))

Comment on lines +197 to +198
class LifecycleAdapter(MockAdapter):
events: list[str] = []

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 events class variable is shared across all instances

events: list[str] = [] is a class-level mutable attribute. Every instance of LifecycleAdapter appends to the same list. In this particular test only one instance is created, so it passes today, but if a second adapter instance were ever created (e.g., in a retry path inside the coordinator), the assertion on line 230 would see duplicate entries and fail unexpectedly. Prefer initialising it in __init__ to make the list per-instance.

Comment on lines +276 to +287
def _prepare_for_position_motion(self) -> None:
if not self._arm:
return

if self._arm.warn_code != 0:
self._arm.clean_warn()
if self._arm.error_code != 0:
self._arm.clean_error()
self._arm.motion_enable(enable=True)
self._arm.set_mode(_XARM_MODE_POSITION)
self._arm.set_state(0)
self._control_mode = ControlMode.POSITION

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 _prepare_for_position_motion silently ignores all SDK return codes

All four calls — clean_warn(), clean_error(), motion_enable(), set_mode(), and set_state() — return error codes that are discarded. If any of these fail (e.g., the arm is mid-fault and clean_error returns non-zero), the subsequent _move_to_initial_pose call will send a position command into an arm that hasn't actually been re-enabled, and activate()/deactivate() will continue rather than surfacing the failure. Checking the return values of at least motion_enable and set_state would make failures diagnosable at their root cause.

@codecov

codecov Bot commented Jun 8, 2026

Copy link
Copy Markdown

@github-actions github-actions Bot added the ready-to-merge Required CI checks have passed on this PR label Jun 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-to-merge Required CI checks have passed on this PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

integrate Xarm with dimos

1 participant